Creators/Authors contains: "Li, Yuliang"

  1. Free, publicly-accessible full text available June 4, 2024
  2. Dataset discovery from data lakes is essential in many real-world application scenarios. In this paper, we propose Starmie, an end-to-end framework for dataset discovery from data lakes, with table union search as the main use case. The framework features a contrastive learning method that trains column encoders from pre-trained language models in a fully unsupervised manner. The column encoder of Starmie captures the rich contextual semantic information within tables by leveraging a contrastive multi-column pre-training strategy. We use the cosine similarity between column embedding vectors as the column unionability score and propose a filter-and-verification framework that allows exploring a variety of design choices for computing the unionability score between two tables. Empirical results on real table benchmarks show that Starmie outperforms the best-known solutions in the effectiveness of table union search by 6.8 in MAP and recall. Moreover, Starmie is the first to employ the HNSW (Hierarchical Navigable Small World) index to accelerate query processing of table union search, which provides a 3,000X performance gain over the linear scan baseline and a 400X performance gain over an LSH index (the state-of-the-art solution for data lake indexing).
  3. The emergence of novel hardware accelerators has powered the tremendous growth of machine learning in recent years. These accelerators deliver large performance gains in processing high-volume matrix operators, particularly matrix multiplication, a core component of neural network training and inference. In this work, we explore opportunities for accelerating database systems using NVIDIA’s Tensor Core Units (TCUs). We present TCUDB, a TCU-accelerated query engine that processes a set of query operators, including natural joins and group-by aggregates, as matrix operators within TCUs. Matrix multiplication was considered inefficient in the past; however, this strategy has remained largely unexplored in conventional GPU-based databases, which primarily rely on vector or scalar processing. We demonstrate the significant performance gain of TCUDB in a range of real-world applications including entity matching, graph query processing, and matrix-based data analytics. TCUDB achieves up to 288× speedup compared to a baseline GPU-based query engine.
  4. null (Ed.)
  5. null (Ed.)
  6. Cognitive complications persist in antiretroviral therapy (ART)-treated people with HIV. However, the pattern and severity of domain-specific cognitive performance is variable and may be exacerbated by ART-mediated neurotoxicity. A total of 929 women with HIV (WWH) from the Women’s Interagency HIV Study were classified into subgroups based on sociodemographic and longitudinal behavioral and clinical data using semi-parametric latent class trajectory modelling. The five subgroups comprised: 1) well-controlled HIV with vascular comorbidities (n = 116); 2) profound HIV legacy effects (CD4 nadir <250 cells/μL; n = 275); 3) primarily <45 year olds with hepatitis C (n = 165); 4) primarily 35–55 year olds (n = 244); and 5) poorly-controlled HIV/substance use (n = 129). Within each subgroup, we fitted a constrained continuation ratio model via penalized maximum likelihood to examine adjusted associations between recent ART agents and cognition. Most drugs were not associated with cognition. However, among the few drugs that were, non-nucleoside reverse transcriptase inhibitors (NNRTIs) and protease inhibitors (PIs) were most commonly associated with cognition, followed by nucleoside reverse transcriptase inhibitors (NRTIs) and integrase inhibitors (IIs). Directionality of ART-cognition associations varied by subgroup. Better psychomotor speed and fluency were associated with ART for women with well-controlled HIV with vascular comorbidities. This pattern contrasts with that for women with profound HIV legacy effects, for whom poorer executive function and fluency were associated with ART. Motor function was associated with ART for younger WWH and primarily 35–55 year olds. Memory was associated with ART only for women with poorly-controlled HIV/substance use. Findings demonstrate interindividual variability in ART-cognition associations among WWH and highlight the importance of considering sociodemographic, clinical, and behavioral factors as underlying contributors to cognition.
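
The TCUDB entry above (item 3) describes processing natural joins and group-by aggregates as matrix operators. As a rough illustration of that general idea only, the sketch below counts the rows of a join R(a, b) ⋈ S(b, c) grouped by (a, c) with a single matrix product. It is not TCUDB's implementation: it runs in plain Python on the CPU, and the function names `matmul` and `join_group_count`, the two-column relation layout, and the sample data are all hypothetical.

```python
def matmul(left, right):
    """Plain-Python matrix product of two row-major lists of lists."""
    inner = len(right)
    cols = len(right[0]) if right else 0
    return [
        [sum(row[k] * right[k][j] for k in range(inner)) for j in range(cols)]
        for row in left
    ]


def join_group_count(r_rows, s_rows):
    """Count rows of R(a, b) JOIN S(b, c), grouped by (a, c).

    Hypothetical helper: R is encoded as a matrix indexed by (a, b) and
    S as a matrix indexed by (b, c), each cell holding a tuple count.
    Entry (a, c) of their product is then the number of join results
    for that group.
    """
    a_vals = sorted({a for a, _ in r_rows})
    b_vals = sorted({b for _, b in r_rows} | {b for b, _ in s_rows})
    c_vals = sorted({c for _, c in s_rows})
    a_idx = {v: i for i, v in enumerate(a_vals)}
    b_idx = {v: i for i, v in enumerate(b_vals)}
    c_idx = {v: i for i, v in enumerate(c_vals)}

    # Encode each relation as a count matrix over its two attributes.
    r_mat = [[0] * len(b_vals) for _ in a_vals]
    for a, b in r_rows:
        r_mat[a_idx[a]][b_idx[b]] += 1
    s_mat = [[0] * len(c_vals) for _ in b_vals]
    for b, c in s_rows:
        s_mat[b_idx[b]][c_idx[c]] += 1

    # Summing over the shared attribute b is exactly the matrix product.
    counts = matmul(r_mat, s_mat)
    return {
        (a_vals[i], c_vals[j]): counts[i][j]
        for i in range(len(a_vals))
        for j in range(len(c_vals))
        if counts[i][j] > 0
    }


# Made-up sample relations for illustration.
R = [("x", 1), ("x", 2), ("y", 2)]
S = [(1, "p"), (2, "p"), (2, "q")]
result = join_group_count(R, S)
```

Here the join key `b` plays the role of the inner dimension, so the sum over matching keys that a hash join would perform row by row becomes one dense product, the kind of operation tensor hardware is built for. A dense encoding like this only pays off when the attribute domains are small enough to fit in a matrix.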